Semidefinite Programming for Graph Partitioning with Preferences in Data Distribution
Authors
Abstract
Graph partitioning with preferences is one of the data distribution models for parallel computers, in which partitioning and mapping are generated together. It improves the overall throughput of message traffic by restricting communication to processors that are near each other whenever possible. This model is obtained by associating with each vertex a value that reflects its net preference for being in one partition or another of the recursive bisection process. We have formulated a semidefinite programming relaxation for graph partitioning with preferences and implemented an efficient subspace algorithm for this model. We numerically compared our new algorithm with a standard semidefinite programming algorithm and show that our subspace algorithm performs better.

Suely Oliveira et al. In: J.M.L.M. Palma et al. (Eds.): VECPAR 2002, LNCS 2565, pp. 703–716, 2003. © Springer-Verlag Berlin Heidelberg 2003

1 The Graph Partitioning Problem and Parallel Data Distribution

Graph partitioning is universally employed in the parallelization of calculations on unstructured grids, such as finite element and finite difference calculations, whether using explicit or implicit methods. Once a graph model of a computation is constructed, graph partitioning can be used to determine how to divide the work and data for efficient parallel computation. The goal of the graph partitioning problem is to divide a graph into disjoint subgraphs subject to the constraint that each subgraph has a roughly equal number of vertices, with the objective of minimizing the number of edges that are cut by the partitioning.

In many calculations the underlying computational structure can be conveniently modeled as a graph in which vertices correspond to computational tasks and edges reflect data dependencies. The objectives here are to distribute the computations evenly among the processors while minimizing interprocessor communication, so that the corresponding assignment of tasks to processors leads to efficient execution. Therefore, we wish to divide the graph into subgraphs with roughly equal numbers of nodes and with the minimum number of edges crossing between the subgraphs.

Graph partitioning is an NP-hard problem [7]. Therefore, heuristics need to be used to get approximate solutions. The graph partitioning problem for high performance scientific computing has been studied extensively over the past decades. The standard approach is to use Recursive Bisection [9, 21]. In Recursive Bisection, the graph is broken in half, the halves are halved independently, and so on, until there are as many pieces as desired.

The justification for traditional partitioning is that the number of edges cut in a partition typically corresponds to the volume of communication in the parallel application. Since communication is an expensive operation, minimizing this volume is extremely important in achieving high performance. In traditional recursive partitioning, after each step in a recursive decomposition the subgraphs are decoupled and interact no further. An edge crossing between two sets does not affect the later partitioning of either set. Consequently, there is nothing preventing the two adjacent vertices from being assigned to processors that are quite far from each other.

A message between distant processors must traverse many wires, which are therefore rendered unavailable to transmit other messages. Conversely, if each message consumes only a small number of wires, more messages can be sent at once. In parallel computing, messages traveling between architecturally distant processors should be minimized by improving the data locality, since they tie up many communication links. Therefore, a good mapping is one that reduces message congestion and thereby preserves communication bandwidth.
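The difference between counting cut edges and counting the wires a message occupies can be made concrete with a small sketch. It is only an illustration under stated assumptions: the 8-vertex ring graph, the 2 x 2 processor mesh with Manhattan (hop) distance, and the names `edge_cut`, `hop_cost`, `near` and `far` are all hypothetical and do not come from the paper.

```python
# Illustrative sketch: two mappings with the same edge cut but different
# total message distance. The graph, mesh and function names are assumptions.

def edge_cut(edges, part):
    """Number of edges whose endpoints lie in different parts."""
    return sum(1 for u, v in edges if part[u] != part[v])


def hop_cost(edges, part, coords):
    """Sum of Manhattan distances between the processors holding each edge's
    endpoints; edges inside one part contribute zero."""
    total = 0
    for u, v in edges:
        (r1, c1), (r2, c2) = coords[part[u]], coords[part[v]]
        total += abs(r1 - r2) + abs(c1 - c2)
    return total


# Ring of 8 vertices, split into 4 parts of 2 consecutive vertices each.
edges = [(i, (i + 1) % 8) for i in range(8)]
part = [0, 0, 1, 1, 2, 2, 3, 3]

# Two placements of the 4 parts on a 2 x 2 processor mesh (row, column).
near = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}  # neighbours stay adjacent
far = {0: (0, 0), 1: (1, 1), 2: (0, 1), 3: (1, 0)}   # some neighbours diagonal

cut = edge_cut(edges, part)
cost_near = hop_cost(edges, part, near)
cost_far = hop_cost(edges, part, far)
print(cut, cost_near, cost_far)
```

Both placements use the same partition, so the edge cut is identical; only the hop cost tells them apart, which is the quantity a partitioner that ignores the mapping cannot see.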
Many scientific computing applications of interest, for example those employing an iterative sparse solver kernel, have a structure in which many messages simultaneously compete for limited communication bandwidth. Good mappings are especially important in these cases.

Recently, Hendrickson et al. [8, 9, 11] pointed out problems with traditional models. Assume we have already partitioned the graph into left and right halves, and that we have similarly divided the left-half graph into top and bottom quadrants (see Figure 1). When partitioning the right-half graph between processors 3 and 4, we want the messages to travel short distances. The mapping shown on the left-hand side of Figure 1 is better, since its total message distance is less than that of the mapping on the right-hand side.

These models are obtained by associating with each vertex a value that reflects its net preference for being in one subgraph or another. Note that this preference is a function only of edges that connect the vertex to vertices which are not in the current subgraph. These preferences should be propagated through the recursive partitioning process. If the graph partitioning problem with preferences is relaxed as is done to obtain spectral graph partitioning, we obtain an extended eigenproblem: find the minimum μ for which there is a y ≠ 0 satisfying Ay = μy + g with a specified norm [11].

In [18] we developed subspace methods to solve extended eigenproblems. In [19] we developed a subspace algorithm for an SDP of the original graph partitioning. In this paper we will develop a semidefinite program for graph ...

[Figure 1: two alternative mappings of the partitioned graph onto processors 1 to 4.]
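One way to see where an equation of the form Ay = μy + g can come from is the stationarity condition of a norm-constrained quadratic problem with a linear preference term. The sketch below is an assumption made for illustration, not the paper's derivation: the objective, its scaling, and the constant c (the specified squared norm) are hypothetical, with A symmetric and g the vector of preferences.

```latex
% Assumed model: A symmetric, g the preference vector, c > 0 the squared norm.
\begin{align*}
  &\min_{y}\ \tfrac{1}{2}\, y^{T} A y - g^{T} y
     \quad \text{subject to } y^{T} y = c, \\
  &\mathcal{L}(y,\mu) = \tfrac{1}{2}\, y^{T} A y - g^{T} y
     - \tfrac{\mu}{2}\,\bigl(y^{T} y - c\bigr), \\
  &\nabla_{y}\mathcal{L} = A y - g - \mu y = 0
     \iff A y = \mu y + g .
\end{align*}
```

When g = 0 the condition reduces to the ordinary eigenproblem Ay = μy, the case that arises in spectral partitioning without preferences.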
Related resources
A Subspace Semidefinite Programming for Spectral Graph Partitioning
A semidefinite program (SDP) is an optimization problem over n × n symmetric matrices where a linear function of the entries is to be minimized subject to linear equality constraints and the condition that the unknown matrix is positive semidefinite. Standard techniques for solving SDPs require O(n^3) operations per iteration. We introduce subspace algorithms that greatly reduce the cost of sol...
Semidefinite spectral clustering
Multi-way partitioning of an undirected weighted graph where pairwise similarities are assigned as edge weights, provides an important tool for data clustering, but is an NP-hard problem. Spectral relaxation is a popular way of relaxation, leading to spectral clustering where the clustering is performed by the eigen-decomposition of the (normalized) graph Laplacian. On the other hand, semidefin...
Graph bisection revisited
The graph bisection problem is the problem of partitioning the vertex set of a graph into two sets of given sizes such that the sum of weights of edges joining these two sets is optimized. We present a semidefinite programming relaxation for the graph bisection problem with a matrix variable of order n, the number of vertices of the graph, that is equivalent to the currently strongest semidefinit...
A note on Fiedler vectors interpreted as graph realizations
The second smallest eigenvalue of the Laplace matrix of a graph and its eigenvectors, also known as Fiedler vectors in spectral graph partitioning, carry significant structural information regarding the connectivity of the graph. Using semidefinite programming duality we offer a geometric interpretation of this eigenspace as optimal solution to a graph realization problem. A corresponding inter...